Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval
Abstract
OBJECTIVES: The purpose of this study was to investigate the effects of query expansion algorithms for MEDLINE retrieval within a pseudo-relevance feedback framework.
METHODS: A number of query expansion algorithms based on pseudo-relevance feedback were tested using various term ranking formulas. The OHSUMED test collection, a subset of the MEDLINE database, was used as the test corpus. The ranking algorithms were tested in combination with different term re-weighting algorithms.
RESULTS: Our comprehensive evaluation showed that the local context analysis ranking algorithm, when used in combination with one of the re-weighting algorithms (Rocchio, the probabilistic model, and our variants), significantly outperformed other algorithm combinations by up to 12% (paired t-test; p < 0.05). In a pseudo-relevance feedback framework, effective query expansion can be achieved through careful consideration of term ranking and re-weighting algorithm pairs, at least in the context of the OHSUMED corpus.
CONCLUSIONS: Comparative experiments on term ranking algorithms were performed on a subset of MEDLINE documents. With medical documents, local context analysis, which uses co-occurrence with all query terms, significantly outperformed various term ranking methods based on both frequency and distribution analyses. Furthermore, the experiments demonstrated that the term rank-based re-weighting method contributed to a remarkable improvement in mean average precision.
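The two-stage pipeline the abstract describes (rank candidate expansion terms from the top-retrieved documents, then re-weight the expanded query) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rocchio_expand`, the dictionary-based document representation, and the parameter values `alpha=1.0` and `beta=0.75` are assumptions chosen for illustration. The term ranking step here uses plain centroid weight as a stand-in; it does not implement local context analysis or any of the specific ranking formulas evaluated in the study. The re-weighting step is the positive-feedback form of Rocchio.

```python
from collections import Counter


def rocchio_expand(query_weights, feedback_docs, n_terms=5, alpha=1.0, beta=0.75):
    """Expand a weighted query from pseudo-relevant documents (illustrative sketch).

    query_weights: dict mapping term -> weight in the original query.
    feedback_docs: list of dicts (term -> weight), assumed to be the top-ranked
                   documents from an initial retrieval run.
    alpha, beta:   Rocchio coefficients; the defaults are assumptions, not
                   values taken from the study.
    """
    if not feedback_docs:
        return dict(query_weights)

    # Centroid of the pseudo-relevant documents.
    centroid = Counter()
    for doc in feedback_docs:
        for term, weight in doc.items():
            centroid[term] += weight / len(feedback_docs)

    # Term ranking step (stand-in): order candidate terms by centroid weight,
    # breaking ties alphabetically so the result is deterministic.
    candidates = [t for t in centroid if t not in query_weights]
    candidates.sort(key=lambda t: (-centroid[t], t))
    selected = candidates[:n_terms]

    # Re-weighting step: Rocchio with positive feedback only.
    expanded = {}
    for term in list(query_weights) + selected:
        expanded[term] = (alpha * query_weights.get(term, 0.0)
                          + beta * centroid.get(term, 0.0))
    return expanded


if __name__ == "__main__":
    # Toy data, invented for this example only.
    query = {"heart": 1.0}
    top_docs = [
        {"heart": 0.5, "cardiac": 0.8, "patient": 0.2},
        {"cardiac": 0.4, "failure": 0.6},
    ]
    print(rocchio_expand(query, top_docs, n_terms=2))
```

Keeping the ranking step and the re-weighting step separate mirrors the abstract's point that the two are chosen as a pair: a different ranking formula would replace only the sort key, leaving the re-weighting untouched.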
Similar resources
Rocchio-Based Relevance Feedback in Video Event Retrieval
This paper investigates methods for user and pseudo relevance feedback in video event retrieval. Existing feedback methods achieve strong performance but adjust the ranking based on few individual examples. We propose a relevance feedback algorithm (ARF) derived from the Rocchio method, which is a theoretically founded algorithm in textual retrieval. ARF updates the weights in the ranking funct...
TREC 2005 Genomics Track Experiments at DUTAI
This paper describes the techniques we applied for the two tasks of the TREC Genomics track, i.e., ad hoc retrieval and categorization tasks. For the ad hoc retrieval task, we used query expansion, different scoring strategy on different parts of Medline record (Title, Abstract, RN, MH, etc.) and pseudo relevance feedback. Our submitted run DUTAdHoc2 obtained a MAP of 0.2349. For the categoriza...
Pseudo-Relevance Feedback and Title Re-Ranking for Chinese Information Retrieval
In our formal runs, we have experimented with the retrieval based on character-based indexing and hybrid term indexing because these are more distinct types of indexing for better pooling. We confirmed that character-based indexing did not produce relatively good retrieval effectiveness. We have also experimented with three new pseudo-relevance feedback (PRF) methods. These new methods were abl...
More Reflections on "Aboutness" TREC-2001 Evaluation Experiments at Justsystem
The TREC-2001 Web track evaluation experiments at the Justsystem site are described with a focus on the “aboutness” based approach in text retrieval. In the web ad hoc task, our TREC-9 approach is adopted again, combining both pseudo-relevance feedback and reference database feedback but the setting is calibrated for an early precision preferred search. For the entry page finding task, we combi...
A Cluster Based Pseudo Feedback Technique Which Exploits Good and Bad Clusters
In the last years, cluster based retrieval has been demonstrated as an effective tool for both interactive retrieval and pseudo relevance feedback techniques. In this paper we propose a new cluster based retrieval function which uses the best and worst clusters of a document in the cluster ranking, to improve the retrieval effectiveness. The evaluation shows improvements in some standard TREC c...
Journal title:
Volume 17, issue
Pages -
Publication date: 2011